
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

Lin, Jiabin; Moothedath, Shana; Vaswani, Namrata (June 2024, Proceedings of the 41st International Conference on Machine Learning (ICML) 2024)

We study how representation learning can improve the learning efficiency of contextual bandit problems. We consider the setting where we play T contextual linear bandits of dimension d simultaneously, and these T bandit tasks collectively share a common linear representation of dimension r ≪ d. We present a new estimator, based on alternating projected gradient descent (GD) and minimization, that recovers a low-rank feature matrix. Using the proposed estimator, we present a multi-task learning algorithm for linear contextual bandits and prove a regret bound for it. We also present experiments comparing the performance of our algorithm against benchmark algorithms.
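To make the estimator idea concrete, here is a minimal NumPy sketch of a generic alternating GD and minimization loop for a shared low-rank representation. It is an illustration under stated assumptions, not the authors' bandit algorithm: the data are offline, noiseless, and Gaussian, the spectral initialization is untruncated, and the function names (`alt_gd_min`, `subspace_distance`), the step-size rule, and all problem sizes are hypothetical choices made for this example. Each task's parameter is modelled as θ_t = B w_t, with B a d × r matrix with orthonormal columns shared across the T tasks.

```python
import numpy as np


def alt_gd_min(X, y, r, n_iters=100, step=0.5):
    """Sketch of alternating GD and minimization for Theta = B @ W.

    X: (T, n, d) contexts, y: (T, n) rewards, r: assumed rank.
    Returns B (d, r) with orthonormal columns and W (r, T).
    """
    T, n, d = X.shape

    def min_step(B):
        # Minimization step: per-task least squares for w_t with B fixed.
        W = np.zeros((r, T))
        for t in range(T):
            W[:, t] = np.linalg.lstsq(X[t] @ B, y[t], rcond=None)[0]
        return W

    # Spectral initialization (assumption: no truncation of large entries).
    # Column t of Theta0 is X_t^T y_t / n.
    Theta0 = np.einsum("tnd,tn->dt", X, y) / n
    U, _, _ = np.linalg.svd(Theta0, full_matrices=False)
    B = U[:, :r]

    for _ in range(n_iters):
        W = min_step(B)
        # Gradient of 0.5 * sum_t ||y_t - X_t B w_t||^2 with respect to B.
        resid = np.einsum("tnd,dt->tn", X, B @ W) - y
        grad = np.einsum("tnd,tn,rt->dr", X, resid, W)
        # Step size scaled by n * ||W||_2^2 (an assumption of this sketch).
        eta = step / (n * np.linalg.norm(W, 2) ** 2)
        # Projected GD step: QR restores orthonormal columns.
        B, _ = np.linalg.qr(B - eta * grad)

    return B, min_step(B)


def subspace_distance(B_true, B_hat):
    """Spectral norm of the part of B_hat outside span(B_true)."""
    d = B_true.shape[0]
    return np.linalg.norm((np.eye(d) - B_true @ B_true.T) @ B_hat, 2)


# Synthetic check with hypothetical sizes: T tasks, n samples per task.
rng = np.random.default_rng(0)
d, r, T, n = 20, 2, 50, 30
B_true, _ = np.linalg.qr(rng.standard_normal((d, r)))
W_true = rng.standard_normal((r, T))
X = rng.standard_normal((T, n, d))
y = np.einsum("tnd,dt->tn", X, B_true @ W_true)

B_hat, W_hat = alt_gd_min(X, y, r)
print("subspace distance:", subspace_distance(B_true, B_hat))
```

The QR factorization plays the role of the projection: after each gradient step it maps B back to a matrix with orthonormal columns, while the per-task least-squares solve only involves r unknowns instead of d.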

